This paper deals with the problem of selecting the best population through the sequential
subset selection approach. Based on the modified likelihood ratio of the probability
density function of some invariant sufficient statistics, a sequential subset selection
procedure is proposed. When the procedure terminates, one can assert with a guaranteed
probability $P^*$ that the best population is included in the selected subset and that each
selected population is within some fixed distance from the best population.
1.  Introduction

Consider the problem of selecting the “best” among $k$ populations. Suppose that
observations can be obtained from the $k$ populations sequentially. It is often desirable
to stop sampling from a population as soon as there is statistical evidence that it
is not the best population, and to eliminate that population from further consideration.
Selection through sequential comparison with elimination provides a significant advantage:
to achieve a given accuracy it requires, on average, substantially fewer observations than
fixed sample size procedures.

In sequential selection and ranking procedures, contributions have been made to selecting
the best population by using the indifference zone approach. In the simplest formulation of
this approach, one wishes to select only a single population and to guarantee, with a
prespecified probability, that the selected population is the best population, provided that
the parameters satisfy some additional condition, usually specified through an indifference
zone. However, in many real situations it is hard, or not always possible, to specify the
indifference (preference) zone condition. A reasonable and useful alternative is to derive a
sequential selection procedure that selects a small subset containing the best population.
A poor population may, however, be contained in the selected subset. Recently, Hsu (1981,
1982) and Hsu and Edwards (1983) studied methods to derive simultaneous upper confidence
intervals for all measures of separation between the unknown best population and each
(non-best) population under the location model. This motivates us to study selection rules
such that, with some prespecified guaranteed probability, not only is the best population
selected, but each selected population is also very close to the best population.

In this paper, some sequential subset selection procedures achieving the goal described
above are derived. These procedures are based on an invariant statistic for the parameters
of interest. We consider observations from each pair of the $k$ populations and perform a
modified sequential probability ratio test (MSPRT) based on the invariant statistics. This
is done simultaneously for all pairs of populations, and if a particular MSPRT terminates,
then an appropriate population is removed from the set of contending populations. This is
continued until only one population remains in this set or statistical evidence indicates
that all the populations remaining in this set are within a (small) specified distance from
the unknown best population. At each stage these procedures also provide statistical
inference about an upper bound on the measure of separation between the unknown best
population and each remaining population.

2.  Formulation  of  the  Selection  Problem 

Let $\pi_1, \ldots, \pi_k$ represent $k$ ($k \ge 2$) populations and let $X_{in}$ denote the
$n$th observation from population $\pi_i$, $i = 1, \ldots, k$. It is assumed that the
observations $X_{in}$, $i = 1, \ldots, k$; $n = 1, 2, \ldots$ are independently distributed.
Suppose that $X_{in}$ has distribution function $F(x \mid \theta_i)$ depending on some unknown
parameter $\theta_i$ for $i = 1, \ldots, k$. Let $\theta = (\theta_1, \ldots, \theta_k)$ and let
$\Omega = \{\theta \mid \theta = (\theta_1, \ldots, \theta_k)\}$ be the parameter space. For each
$i$ and $j$, let $\delta_{ij} = \delta(\theta_i, \theta_j)$ be a measure of separation between
$\pi_i$ and $\pi_j$, where $\delta(\theta_i, \theta_j)$, as a function of $\theta_i$ and
$\theta_j$, is increasing (decreasing) in $\theta_i$ ($\theta_j$) when $\theta_j$ ($\theta_i$) is
fixed, and satisfies the condition that $\delta(\theta, \theta) = \delta_0$ for all $\theta$.
Define $\delta_i = \min_{j \ne i} \delta_{ij}$ and $\delta = \max_{1 \le i \le k} \delta_i$.
Population $\pi_i$ is called the best population if $\pi_i$ is the unique population such that
$\delta_i = \delta$. If more than one population has this property, one of them is tagged and
considered as the best population. We use $(k)$ to denote the index of the best population
and denote the best population by $\pi_{(k)}$.
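
As a small illustration of these definitions, the following sketch computes $\delta_{ij}$, $\delta_i$, $\delta$ and the index of the best population. It assumes the particular separation measure $\delta(\theta_i, \theta_j) = \theta_i - \theta_j$ (the normal-means case treated later in the paper); the function names and the numerical values of $\theta$ are hypothetical.

```python
def separation(theta_i, theta_j):
    """Assumed separation measure delta(theta_i, theta_j) = theta_i - theta_j."""
    return theta_i - theta_j


def best_population(theta):
    """Return (list of delta_i, delta, index of the best population)."""
    k = len(theta)
    # delta_i = min over j != i of delta_ij
    delta_i = [
        min(separation(theta[i], theta[j]) for j in range(k) if j != i)
        for i in range(k)
    ]
    delta = max(delta_i)
    # If several populations attain delta, the first one is "tagged" as best.
    return delta_i, delta, delta_i.index(delta)


# Hypothetical parameter values, for illustration only.
delta_i, delta, best = best_population([0.0, 0.5, 2.0, 1.25])
print(delta_i, delta, best)
```

With this separation measure $\delta_0 = 0$, and $\delta$ is the gap between the largest and second largest $\theta_i$.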

Suppose that observations from the $k$ populations are taken sequentially. The selection
procedure will depend upon the observations through a sequence of statistics
$\{T_{ij}(n), n \ge 1\}$, which are defined to be functions

(2.1)  $T_{ij}(n) = T_n(X_{i1}, \ldots, X_{in}; X_{j1}, \ldots, X_{jn})$

of the first $n$ observations from populations $\pi_i$ and $\pi_j$. In a given problem, the
function $T_n$ is chosen so as to indicate a measure of the separation between the populations
in a reasonable way. Let $\mathbf{T}_{ij}(n) = (T_{ij}(1), \ldots, T_{ij}(n))$. We assume that
$\mathbf{T}_{ij}(n)$ has a joint probability density $g_n(\mathbf{t}_{ij}(n) \mid \delta_{ij})$
depending on the parameters $\theta_i$ and $\theta_j$ only through
$\delta_{ij} = \delta(\theta_i, \theta_j)$. Usually, $T_{ij}(1), T_{ij}(2), \ldots$ are chosen so
that the sequence is both sufficient and transitive, and also invariantly sufficient for
$\delta_{ij}$ (see Hall, Wijsman and Ghosh (1965)).

We assume that there is no information about the configuration of the $\delta_{ij}$'s,
$1 \le i, j \le k$, $i \ne j$. However, we desire that each selected population should not be
far from the best population. Let $\delta_{i(k)}$ denote the measure of separation from the
population $\pi_i$ to the best population $\pi_{(k)}$. Then, by our definition,
$\delta_{i(k)} \le \delta_0$. For a prespecified value $\delta_* < \delta_0$, population $\pi_i$
is said to be good if $\delta_{i(k)} \ge \delta_*$, and bad otherwise. Let $S$ denote the selected
subset and $CS(\delta_*)$ denote the event that $\pi_{(k)} \in S$ and
$\delta_{i(k)} \ge \delta_*$ for all $\pi_i \in S$. We desire a sequential subset selection
procedure $\mathcal{P}$ such that

(2.2)  $P_\theta\{CS(\delta_*) \mid \mathcal{P}\} \ge P^*$ for all $\theta \in \Omega$,

where $P^*$ ($k^{-1} < P^* < 1$) is a prespecified probability level.
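
The event $CS(\delta_*)$ can be made concrete with a short check. The sketch below again assumes $\delta_{ij} = \theta_i - \theta_j$ (so $\delta_0 = 0$ and $\delta_* < 0$) and represents the selected subset as a set of indices; the function name and the numbers are hypothetical.

```python
def is_correct_selection(theta, selected, delta_star):
    """True if the best population is selected and every selected one is good.

    Assumes delta_ij = theta_i - theta_j, so pi_i is good when
    theta_i - theta_best >= delta_star, with delta_star < 0.
    """
    best = max(range(len(theta)), key=lambda i: theta[i])
    if best not in selected:
        return False
    return all(theta[i] - theta[best] >= delta_star for i in selected)


theta = [0.0, 0.5, 2.0, 1.25]                       # hypothetical values
print(is_correct_selection(theta, {2, 3}, -1.0))    # both within distance 1
print(is_correct_selection(theta, {1, 2}, -1.0))    # population 1 is bad
print(is_correct_selection(theta, {3}, -1.0))       # best population missing
```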

3.  Sequential Selection Procedure $\mathcal{P}$

Let $h(\cdot)$ be a monotonically decreasing function such that $h(\delta_{ij}) = \delta_{ji}$.
Let $\delta_*$ ($< \delta_0$) be a prespecified value used to specify the event $CS(\delta_*)$.
Then $\delta_0 = h(\delta_0) < h(\delta_*)$. Let $\delta_1$ be a value such that
$\delta_0 < \delta_1 < h(\delta_*)$. Consider the likelihood ratio statistics

(3.1)  $L_{ij}(n, a) = \dfrac{g_n(\mathbf{T}_{ij}(n) \mid \delta_1)}{g_n(\mathbf{T}_{ij}(n) \mid a)}, \quad n \ge n_0,$

where $a \le \delta_0$ and $n_0$ is some positive integer. Hoel (1971) and Gupta and Huang (1975)
have used the statistics $L_{ij}(n, a)$, $n \ge n_0$, to construct sequential selection
procedures, where $n_0$ is the initial sample size of the procedures. For simplicity, we
assume that $n_0 = 1$. We now define a sequential selection procedure $\mathcal{P}$ as follows:

Let $S_0 = \{\pi_1, \ldots, \pi_k\}$. For each $n \ge 1$, define

(3.2)  $S_n = \{\pi_i \in S_{n-1} \mid L_{ji}(n, \delta_0) < \tfrac{k-1}{1-P^*} \text{ for all } \pi_j \in S_{n-1} - \{\pi_i\}\}$.

That is, $S_n$ is the set of contending populations up to stage $n$. At stage $n$, population
$\pi_i \in S_n$ is labelled as good if $L_{ij}(n, \delta_*) \ge \tfrac{k-1}{1-P^*}$ for all
$\pi_j \in S_n - \{\pi_i\}$. Let $|S_n|$ denote the size of the set $S_n$. The procedure
terminates if either $|S_n| = 1$ or all the populations in $S_n$ have been labelled as good.
In either case, we take $S = S_n$; otherwise, we go to the next stage and the procedure
continues.
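
The steps of the procedure can be sketched in code. The sketch below is not the general procedure: it assumes unit-variance normal observations with $\delta_{ij} = \theta_i - \theta_j$ and $\delta_0 = 0$, for which $\log L_{ij}(n, a) = \frac{\delta_1 - a}{2}(S_{in} - S_{jn}) - \frac{n(\delta_1^2 - a^2)}{4}$ with $S_{in}$ the running sum of observations from $\pi_i$. The callback `sample`, the cap `max_stages`, and all numerical values are hypothetical.

```python
import math
import random


def log_lr(t, n, a, delta1):
    """log L_ij(n, a) for unit-variance normal data, with t = S_in - S_jn."""
    return 0.5 * (delta1 - a) * t - 0.25 * n * (delta1 ** 2 - a ** 2)


def run_procedure(sample, k, p_star, delta_star, delta1, max_stages=100000):
    """Return (selected index set, stopping stage); sample(i) draws from pi_i."""
    log_c = math.log((k - 1) / (1.0 - p_star))  # log of the boundary (k-1)/(1-P*)
    contenders = set(range(k))                  # S_0
    labelled_good = set()
    sums = [0.0] * k                            # running sums S_in
    for n in range(1, max_stages + 1):
        for i in contenders:                    # sample only from S_{n-1}
            sums[i] += sample(i)
        # (3.2): pi_i stays if L_ji(n, delta_0) is below the boundary for all j
        contenders = {
            i for i in contenders
            if all(log_lr(sums[j] - sums[i], n, 0.0, delta1) < log_c
                   for j in contenders if j != i)
        }
        # Label pi_i as good if L_ij(n, delta_*) reaches the boundary for all j
        for i in contenders:
            if all(log_lr(sums[i] - sums[j], n, delta_star, delta1) >= log_c
                   for j in contenders if j != i):
                labelled_good.add(i)
        if len(contenders) == 1 or contenders <= labelled_good:
            return contenders, n
    raise RuntimeError("no termination within max_stages")


rng = random.Random(1)
theta = [0.0, 0.3, 1.0]  # hypothetical means
selected, stages = run_procedure(
    lambda i: rng.gauss(theta[i], 1.0), k=3, p_star=0.9,
    delta_star=-0.5, delta1=0.2)
print(selected, stages)
```

Sampling stops for a population as soon as it leaves the contending set, which is where the savings over a fixed sample size design come from.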


4.  Probability of a Correct Selection

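Let $g_m(t \mid \mathbf{t}(m-1), \delta)$ denote the conditional probability density function of
$T_{ij}(m)$ given $\mathbf{T}_{ij}(m-1) = \mathbf{t}(m-1)$, and let $L_{ij}(n, a)$ be the
statistic defined in (3.1). Then the statistics $L_{ij}(n, a)$, $n \ge 1$, can be rewritten as

(4.1)  $L_{ij}(n, a) = \dfrac{g_1(T_{ij}(1) \mid \delta_1)}{g_1(T_{ij}(1) \mid a)} \prod_{m=2}^{n} \dfrac{g_m(T_{ij}(m) \mid \mathbf{T}_{ij}(m-1), \delta_1)}{g_m(T_{ij}(m) \mid \mathbf{T}_{ij}(m-1), a)}$,

where $\prod_{m=2}^{n} [\,\cdot\,] = 1$ if $n = 1$. For each $n \ge 1$, let $\mathcal{F}_{ij}(n)$
denote the $\sigma$-field generated by $\mathbf{T}_{ij}(n)$. Then,

Lemma 4.1. $\{L_{ij}(n, \delta_{ij}), \mathcal{F}_{ij}(n), n \ge 1\}$ forms a nonnegative
martingale under $P_\theta$ for $i \ne j$.

Proof: This lemma can be proved by a direct computation.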
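
The direct computation can be sketched as follows. Write $L_n = L_{ij}(n, \delta_{ij})$, and assume (for this sketch only) that the conditional densities under $\delta_1$ and $\delta_{ij}$ have the same support. Since $L_{n+1}/L_n$ is the ratio of the conditional densities of $T_{ij}(n+1)$ given $\mathbf{T}_{ij}(n)$, and $\delta_{ij}$ is the true value of the separation,

```latex
\begin{align*}
E_\theta\left[ L_{n+1} \mid \mathcal{F}_{ij}(n) \right]
 &= L_n \, E_\theta\left[
      \frac{g_{n+1}(T_{ij}(n+1) \mid \mathbf{T}_{ij}(n), \delta_1)}
           {g_{n+1}(T_{ij}(n+1) \mid \mathbf{T}_{ij}(n), \delta_{ij})}
      \,\Big|\, \mathcal{F}_{ij}(n) \right] \\
 &= L_n \int
      \frac{g_{n+1}(t \mid \mathbf{T}_{ij}(n), \delta_1)}
           {g_{n+1}(t \mid \mathbf{T}_{ij}(n), \delta_{ij})}
      \, g_{n+1}(t \mid \mathbf{T}_{ij}(n), \delta_{ij}) \, dt \\
 &= L_n \int g_{n+1}(t \mid \mathbf{T}_{ij}(n), \delta_1) \, dt = L_n .
\end{align*}
```

Nonnegativity is immediate because $L_n$ is a ratio of densities.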


Now, let $E$ and $E_i^c$ ($1 \le i \le k$, $i \ne (k)$) be the events defined below:

(4.2)  $E = \{L_{i(k)}(n, \delta_{i(k)}) < \tfrac{k-1}{1-P^*} \text{ for all } \pi_i \in S_{n-1} - \{\pi_{(k)}\} \text{ for all } n \ge 1\}$,
       $E_i^c = \{L_{i(k)}(n, \delta_{i(k)}) \ge \tfrac{k-1}{1-P^*} \text{ for some } n \ge 1\}$.

Then, we have the following lemma:

Lemma 4.2. (a) $P_\theta\{E_i^c\} \le \tfrac{1-P^*}{k-1}$ for all $i \ne (k)$, $\theta \in \Omega$.

(b) $P_\theta\{E\} \ge P^*$ for all $\theta \in \Omega$.

Proof: Part (a) is a consequence of Lemma 4.1 and a lemma of Robbins and Siegmund (1973).
For the proof of part (b), we have

$P_\theta\{E\} \ge 1 - P_\theta\{\bigcup_{i \ne (k)} E_i^c\} \ge 1 - \sum_{i \ne (k)} P_\theta\{E_i^c\} \ge P^*.$

This completes the proof of this lemma.
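
The bound in part (a) is an instance of the maximal inequality for nonnegative martingales with mean one: the probability that the martingale ever reaches a level $c$ is at most $1/c$. The following Monte Carlo sketch illustrates this numerically under assumptions that are not part of the general setting: unit-variance normal data, true separation $\delta_{ij} = 0$, a finite horizon, and hypothetical values of $k$, $P^*$ and $\delta_1$.

```python
import math
import random


def crossing_frequency(k, p_star, delta1, horizon, paths, seed=0):
    """Estimate P{ L_ij(n, 0) >= (k-1)/(1-P*) for some n <= horizon }
    when the true separation is 0 and the data are unit-variance normal."""
    rng = random.Random(seed)
    log_c = math.log((k - 1) / (1.0 - p_star))
    step_sd = math.sqrt(2.0)  # X_in - X_jn has variance 2
    hits = 0
    for _ in range(paths):
        t = 0.0  # T_ij(n) = S_in - S_jn
        for n in range(1, horizon + 1):
            t += rng.gauss(0.0, step_sd)
            # log L_ij(n, 0) = (delta1 / 2) * t - n * delta1^2 / 4
            if 0.5 * delta1 * t - 0.25 * n * delta1 ** 2 >= log_c:
                hits += 1
                break
    return hits / paths


bound = (1.0 - 0.8) / (3 - 1)  # (1 - P*) / (k - 1)
print(crossing_frequency(k=3, p_star=0.8, delta1=1.0, horizon=100, paths=2000), bound)
```

The estimated frequency should fall below the bound; the gap reflects the overshoot of the boundary and the finite horizon.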

Now, for each $a \le \delta_0$ (the value of $a$ is chosen so that the joint probability
density function $g_n(\mathbf{t}_{ij}(n) \mid a)$ is well defined), let
$A_{ij}(m, a) = \{L_{ij}(m, a) < \tfrac{k-1}{1-P^*}\}$. In the following, we also assume that
the following condition is satisfied.

(4.3)  Condition A: $\bigcap_{m=1}^{n} A_{ij}(m, b) \subset \bigcap_{m=1}^{n} A_{ij}(m, a)$ for all $n \ge 1$ and $b < a \le \delta_0$.

The implication of (4.3) is that the statistics $L_{ij}(n, a)$, $n \ge 1$, never exceed the
boundary level before the statistics $L_{ij}(n, b)$, $n \ge 1$, do when $b < a \le \delta_0$.
A sufficient condition for (4.3) to hold is that $A_{ij}(n, b) \subset A_{ij}(n, a)$ for all
$n \ge 1$.

For each $n \ge 1$ and $\pi_i, \pi_j \in S_{n-1}$, $i \ne j$, define $B_{ij}(n)$ and $D_{ij}(n)$
as follows:

$B_{ij}(n) = \{a \le \delta_0 \mid L_{ij}(n, a) < \tfrac{k-1}{1-P^*}\}$,

$D_{ij}(n) = \begin{cases} \inf B_{ij}(n) & \text{if } B_{ij}(n) \ne \phi, \\ \delta_0 & \text{if } B_{ij}(n) = \phi, \end{cases}$

where $\phi$ denotes the empty set. Also, let $D_{ii}(n) = \delta_0$.

Under Condition A, if $D_{ij}(n) < \delta_0$, then $L_{ij}(n, a) < \tfrac{k-1}{1-P^*}$ for all
$D_{ij}(n) < a \le \delta_0$ and $L_{ij}(n, b) \ge \tfrac{k-1}{1-P^*}$ for all $b < D_{ij}(n)$.
For each $n \ge 1$, if $\pi_i \in S_{n-1}$, define

(4.6)  $D_i(n) = \max_{1 \le m \le n} \left( \min_{\pi_j \in S_{m-1}} D_{ij}(m) \right)$.

If $\pi_i \notin S_{n-1}$, let $n_i = \max\{m \mid \pi_i \in S_{m-1}\}$ and define
$D_i(n) = D_i(n_i)$.

By the definition of $D_i(n)$, for each $i = 1, \ldots, k$, $\{D_i(n)\}$ is an increasing
sequence bounded above by $\delta_0$.
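
For intuition, $D_{ij}(n)$ has a closed form in one special case. Assume unit-variance normal data with $\delta_{ij} = \theta_i - \theta_j$ and $\delta_0 = 0$, so that $\log L_{ij}(n, a) = \frac{\delta_1 - a}{2} t - \frac{n(\delta_1^2 - a^2)}{4}$ with $t = T_{ij}(n)$. Writing $c = (k-1)/(1-P^*)$, the inequality $L_{ij}(n, a) < c$ is equivalent to $(a - t/n)^2 < (t/n - \delta_1)^2 + 4 \log c / n$, so $D_{ij}(n)$ is the smaller root of this quadratic, truncated at $0$. The sketch below implements this; the function names and inputs are hypothetical.

```python
import math


def log_lr(t, n, a, delta1):
    """log L_ij(n, a) for unit-variance normal data, with t = S_in - S_jn."""
    return 0.5 * (delta1 - a) * t - 0.25 * n * (delta1 ** 2 - a ** 2)


def d_ij(t, n, k, p_star, delta1):
    """D_ij(n) in the assumed normal case (delta_0 = 0)."""
    log_c = math.log((k - 1) / (1.0 - p_star))
    mean_diff = t / n
    # Smaller root of (a - t/n)^2 = (t/n - delta1)^2 + 4 log(c) / n
    lower_root = mean_diff - math.sqrt((mean_diff - delta1) ** 2 + 4.0 * log_c / n)
    # B_ij(n) is empty when the root is nonnegative, in which case D_ij(n) = 0
    return min(0.0, lower_root)


print(d_ij(t=-5.0, n=10, k=3, p_star=0.9, delta1=0.2))
print(d_ij(t=40.0, n=10, k=3, p_star=0.9, delta1=0.2))
```

In this case $D_{ij}(n)$ acts as a lower confidence bound for $\delta_{ij}$ that is capped at $\delta_0 = 0$.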

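Lemma 4.3. Let $L_{ij}(n, a)$, $S_n$, $D_i(n)$ and the event $E$ be as defined in (3.1), (3.2),
(4.6) and (4.2), respectively. Then, under Condition A,

$E \subset \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_i(n) \text{ for all } \pi_i \in S_{n-1} \text{ for all } n \ge 1\}$.

Proof: Since $\delta_{i(k)} \le \delta_0$ for all $i$, then, under Condition A, we have

$E = \{L_{i(k)}(n, \delta_{i(k)}) < \tfrac{k-1}{1-P^*} \text{ for all } \pi_i \in S_{n-1} - \{\pi_{(k)}\} \text{ for all } n \ge 1\}$

$\subset \{L_{i(k)}(n, \delta_0) < \tfrac{k-1}{1-P^*} \text{ and } \delta_{i(k)} \ge D_{i(k)}(n) \text{ for all } \pi_i \in S_{n-1} - \{\pi_{(k)}\} \text{ for all } n \ge 1\}$

$\subset \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_{i(k)}(n) \text{ for all } \pi_i \in S_{n-1} - \{\pi_{(k)}\} \text{ for all } n \ge 1\}$

(4.7)  $= \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_{i(k)}(n) \text{ for all } \pi_i \in S_{n-1} \text{ for all } n \ge 1\}$

$\subset \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge \min_{\pi_j \in S_{n-1}} D_{ij}(n) \text{ for all } \pi_i \in S_{n-1} \text{ for all } n \ge 1\}$

$= \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_i(n) \text{ for all } \pi_i \in S_{n-1} \text{ for all } n \ge 1\}$.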

An immediate consequence of Lemmas 4.2 and 4.3 is: Under Condition A,

(4.8)  $P_\theta\{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_i(n) \text{ for all } \pi_i \in S_{n-1} \text{ for all } n \ge 1\} \ge P^*$ for $\theta \in \Omega$.

This result provides a sequential confidence region inference, with confidence level at least
$P^*$, as follows: simultaneously, at each stage $n$, the best population is not eliminated and
the separation from each remaining population, say $\pi_i$, to the unknown best population
is not less than $D_i(n)$, for all $n \ge 1$. Another consequence of Lemma 4.2 and Lemma 4.3
is that when the selection procedure $\mathcal{P}$ terminates, the event $CS(\delta_*)$ is
guaranteed with probability at least $P^*$. We state this result as a theorem as follows:

Theorem 4.1. Let $\mathcal{P}$ be the sequential selection procedure defined in Section 3. Also,
suppose that Condition A in (4.3) holds. Then,

$P_\theta\{CS(\delta_*) \mid \mathcal{P}\} \ge P^*$ for all $\theta \in \Omega$,

provided that the procedure $\mathcal{P}$ terminates with probability one.

Proof: Note that when the selection procedure $\mathcal{P}$ terminates, either $|S| = 1$ or all
the populations in $S$ must have been labelled as good at some stage. Let $N$ be the stopping
time of the selection procedure $\mathcal{P}$ and, when $|S| \ge 2$, for each $\pi_i \in S$, let
$N_i$ denote the first time that $\pi_i$ was labelled as good. Then,
$L_{ij}(N_i, \delta_*) \ge \tfrac{k-1}{1-P^*}$ for all $\pi_j \in S_{N_i} - \{\pi_i\}$.
Under Condition A, by the definition of $D_{ij}(n)$, $D_{ij}(N_i) \ge \delta_*$ for all
$\pi_j \in S_{N_i} - \{\pi_i\}$ and thus $D_{i(k)}(N_i) \ge \delta_*$ if
$\pi_{(k)} \in S_{N_i} - \{\pi_i\}$. Also, note that $S = S_N$ and, when $|S| \ge 2$,
$N_i \le N$ for all $\pi_i \in S$. Now from (4.7),

$E \subset \{\pi_{(k)} \in S \text{ and } \delta_{i(k)} \ge D_{i(k)}(n) \text{ for all } \pi_i \in S_{n-1} - \{\pi_{(k)}\} \text{ for all } n \ge 1\}$

$\subset \{\pi_{(k)} \in S \text{ and } |S| = 1\} \cup \{\pi_{(k)} \in S, |S| \ge 2, \delta_{i(k)} \ge D_{i(k)}(N_i) \text{ for all } \pi_i \in S - \{\pi_{(k)}\}\}$

$\subset \{\pi_{(k)} \in S \text{ and } |S| = 1\} \cup \{\pi_{(k)} \in S, |S| \ge 2, \delta_{i(k)} \ge \delta_* \text{ for all } \pi_i \in S - \{\pi_{(k)}\}\}$

$= CS(\delta_*)$.

Then, by Lemma 4.2, we have, for all $\theta \in \Omega$,

$P_\theta\{CS(\delta_*) \mid \mathcal{P}\} \ge P_\theta\{E\} \ge P^*$.

5.  Illustrative Example: Selecting the Population with the Largest Normal Mean


Let $\pi_1, \ldots, \pi_k$ be $k$ populations and let $X_{in}$ denote the $n$th observation
taken from population $\pi_i$. Assume that $X_{in}$ has a normal distribution with an unknown
mean $\theta_i$ and a common known variance $\sigma^2 = 1$, $i = 1, \ldots, k$. Define the
measure of separation between $\pi_i$ and $\pi_j$ as $\delta_{ij} = \theta_i - \theta_j$. Then
$\delta_0 = 0$ and $\delta = \theta_{(k)} - \theta_{(k-1)}$, where
$\theta_{(1)} \le \ldots \le \theta_{(k)}$ are the ordered values of the $\theta_i$'s. Thus, the
population with the largest mean is considered as the best population. For a given
$\delta^* > 0$, $\pi_i$ is said to be good if $\theta_{(k)} - \theta_i \le \delta^*$ and bad
otherwise. For a prespecified probability $P^*$ ($k^{-1} < P^* < 1$), we wish to derive a
sequential selection procedure such that

$P_\theta\{\pi_{(k)} \in S \text{ and } \theta_{(k)} - \theta_i \le \delta^* \text{ for all } \pi_i \in S\} \ge P^*$

for all $\theta \in \Omega$.


For each $n \ge 1$, define $T_{ij}(n) = S_{in} - S_{jn}$, where $S_{in} = \sum_{m=1}^{n} X_{im}$.
Let $\delta_* = -\delta^*$ and let $0 < \delta_1 < \delta^*$. Then,


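$\log L_{ij}(n, 0) = \dfrac{\delta_1}{2}(S_{in} - S_{jn}) - \dfrac{n\delta_1^2}{4}$,

$\log L_{ij}(n, \delta_*) = \dfrac{\delta_1 + \delta^*}{2}(S_{in} - S_{jn}) + \dfrac{n(\delta^{*2} - \delta_1^2)}{4}$.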
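
These two expressions can be checked by a short computation. The differences $Y_m = X_{im} - X_{jm}$ (a symbol introduced here only for this sketch) are independent $N(\delta_{ij}, 2)$ variables with $T_{ij}(n) = \sum_{m=1}^{n} Y_m$, so the likelihood ratio of $\delta_1$ against any value $a$ satisfies

```latex
\begin{align*}
\log L_{ij}(n, a)
 &= \sum_{m=1}^{n} \left[ -\frac{(Y_m - \delta_1)^2}{4} + \frac{(Y_m - a)^2}{4} \right] \\
 &= \frac{\delta_1 - a}{2} \sum_{m=1}^{n} Y_m - \frac{n(\delta_1^2 - a^2)}{4} \\
 &= \frac{\delta_1 - a}{2} \, (S_{in} - S_{jn}) - \frac{n(\delta_1^2 - a^2)}{4} .
\end{align*}
```

Taking $a = \delta_0 = 0$ and $a = \delta_* = -\delta^*$ gives the two cases used in this section.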


In order to apply the procedure $\mathcal{P}$ to this selection problem, we need to make sure
that this procedure terminates with probability one.

Lemma 5.1. For the problem of selecting the population with the largest mean among $k$
normal populations with a common known variance, the sequential selection procedure
$\mathcal{P}$ terminates with probability one if $0 < \delta_1 < \delta^*/2$.



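Proof: It suffices to show that for any two populations, say $\pi_1$ and $\pi_2$, with
probability one, the event $H$, that either one of them will be eliminated (in comparison with
the other) or both of them are labelled as good, occurs. Without loss of generality, we assume
that $\theta_1 \ge \theta_2$.

First consider the case that $\theta_1 - \theta_2 > \delta_1/2$. Define
$N_1 = \min\{n \mid L_{12}(n, 0) \ge \tfrac{k-1}{1-P^*}\}$. By the strong law of large numbers,
$\frac{1}{n} \log L_{12}(n, 0) \to \frac{\delta_1}{2}(\theta_1 - \theta_2 - \frac{\delta_1}{2}) > 0$
a.e. as $n \to \infty$, while $\frac{1}{n} \log \tfrac{k-1}{1-P^*} \to 0$ as $n \to \infty$.
Hence, $P_\theta\{N_1 < \infty\} = 1$.

Next, consider the case $0 \le \theta_1 - \theta_2 \le \delta_1/2$. Define
$N_{ij} = \min\{n \mid L_{ij}(n, \delta_*) \ge \tfrac{k-1}{1-P^*}\}$ for $i, j = 1, 2$, $i \ne j$,
and $N_2 = \max(N_{12}, N_{21})$. By the strong law of large numbers again,
$\frac{1}{n} \log L_{12}(n, \delta_*) \to (\theta_1 - \theta_2 + \frac{\delta^* - \delta_1}{2})(\delta_1 + \delta^*)/2 > 0$
a.e. as $n \to \infty$, and
$\frac{1}{n} \log L_{21}(n, \delta_*) \to (\theta_2 - \theta_1 + \frac{\delta^* - \delta_1}{2})(\delta_1 + \delta^*)/2 > 0$
a.e. as $n \to \infty$; the second limit is positive because
$\theta_1 - \theta_2 \le \delta_1/2 < (\delta^* - \delta_1)/2$ when $\delta_1 < \delta^*/2$.
Hence, $P_\theta\{N_{ij} < \infty\} = 1$ for $i, j = 1, 2$, $i \ne j$, and so
$P_\theta\{N_2 < \infty\} = 1$.

Finally, one can observe that $\{N_1 < \infty\} \cup \{N_2 < \infty\} \subset H$. Thus, based on
the above discussion, we have $P_\theta\{H\} \ge P_\theta\{N_1 < \infty \text{ or } N_2 < \infty\} = 1$
for all $\theta \in \Omega$. Hence the proof of this lemma is complete.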

Now, to guarantee the $P^*$-condition for the event $CS(\delta_*)$, from Theorem 4.1, it
suffices to verify Condition A given in (4.3). This can be easily verified.
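
One way to carry out this verification is sketched below, with $c = (k-1)/(1-P^*)$ and $\bar{T} = T_{ij}(n)/n$ (notation introduced here only for the sketch). For unit-variance normal data and $a \le 0 < \delta_1$, the log likelihood ratio factors as

```latex
\begin{align*}
\log L_{ij}(n, a)
 &= \frac{\delta_1 - a}{2} \, T_{ij}(n) - \frac{n(\delta_1^2 - a^2)}{4}
  = \frac{n(\delta_1 - a)}{2} \left( \bar{T} - \frac{\delta_1 + a}{2} \right) .
\end{align*}
```

Take $b < a \le 0$. Then $\delta_1 - b > \delta_1 - a > 0$ and $\bar{T} - \frac{\delta_1 + b}{2} > \bar{T} - \frac{\delta_1 + a}{2}$. If $\bar{T} - \frac{\delta_1 + a}{2} \le 0$, then $\log L_{ij}(n, a) \le 0 < \log c$, since $c > 1$ when $P^* > k^{-1}$. Otherwise both factors are positive and $\log L_{ij}(n, a) < \log L_{ij}(n, b)$. In either case $L_{ij}(n, b) < c$ implies $L_{ij}(n, a) < c$, that is, $A_{ij}(n, b) \subset A_{ij}(n, a)$ for all $n \ge 1$, which is the sufficient condition for (4.3).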